
Adaptive multimodal fusion by uncertainty compensation with application to audiovisual speech recognition



Abstract

While the accuracy of feature measurements heavily depends on changing environmental conditions, studying the consequences of this fact in pattern recognition tasks has received relatively little attention to date. In this paper, we explicitly take feature measurement uncertainty into account and show how multimodal classification and learning rules should be adjusted to compensate for its effects. Our approach is particularly fruitful in multimodal fusion scenarios, such as audiovisual speech recognition, where multiple streams of complementary time-evolving features are integrated. For such applications, provided that the measurement noise uncertainty for each feature stream can be estimated, the proposed framework leads to highly adaptive multimodal fusion rules which are easy and efficient to implement. Our technique is widely applicable and can be transparently integrated with either synchronous or asynchronous multimodal sequence integration architectures. We further show that multimodal fusion methods relying on stream weights can naturally emerge from our scheme under certain assumptions; this connection provides valuable insights into the adaptivity properties of our multimodal uncertainty compensation approach. We show how these ideas can be practically applied for audiovisual speech recognition. In this context, we propose improved techniques for person-independent visual feature extraction and uncertainty estimation with active appearance models, and also discuss how enhanced audio features along with their uncertainty estimates can be effectively computed. We demonstrate the efficacy of our approach in audiovisual speech recognition experiments on the CUAVE database using either synchronous or asynchronous multimodal integration models. © 2009 IEEE.
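The abstract's core mechanism, compensating a stream's class-conditional likelihood by its estimated measurement-noise variance so that noisier streams automatically count for less in the fused score, can be sketched in a minimal univariate Gaussian form. This is a toy illustration under simplifying assumptions (single-frame, diagonal/scalar variances, simple log-likelihood summation), not the paper's full HMM-based formulation; all function and variable names are hypothetical:

```python
import math

def gaussian_loglik(x, mean, var):
    """Log-density of a univariate Gaussian N(mean, var) at x."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)

def compensated_loglik(x, mean, model_var, noise_var):
    """Uncertainty compensation: inflate the model variance by the
    estimated measurement-noise variance before scoring the feature."""
    return gaussian_loglik(x, mean, model_var + noise_var)

def fuse_streams(observations, models):
    """Sum compensated log-likelihoods over feature streams.

    observations: list of (x, noise_var) pairs, one per stream
    models:       list of (mean, model_var) pairs, one per stream

    As a stream's noise_var grows, its compensated density flattens,
    so it contributes less to the fused score: an implicit, adaptive
    stream weighting with no explicit weight parameters."""
    return sum(
        compensated_loglik(x, mean, model_var, noise_var)
        for (x, noise_var), (mean, model_var) in zip(observations, models)
    )
```

For example, if the visual stream is measured with very high noise variance while the audio stream is clean, the fused decision is driven almost entirely by the audio likelihoods, which mirrors the adaptivity the abstract attributes to the uncertainty-compensated fusion rules.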
